Selecting Locking Primitives for Parallel Programs

Author

  • Paul E. McKenney
Abstract

The only reason to parallelize a program is to gain performance. However, the synchronization primitives used by parallel programs can consume excessive memory bandwidth, suffer from memory latencies, consume excessive memory, and result in unfair access or even starvation. These problems can overwhelm the performance benefits of parallel execution. Therefore, it is necessary to understand the performance implications of synchronization primitives in addition to their correctness, liveness, and safety properties. This paper presents a pattern language to assist you in selecting synchronization primitives for parallel programs. This pattern language assumes you have already chosen a locking design, perhaps by using a locking design pattern language [McK96].

1 Overview

A lock-based parallel program uses synchronization primitives to define critical sections of code in which only one CPU or thread may execute at a time. For example, Figure 1 presents a fragment of parallel code to search and update a linear list. In this C-code example, the lt_next field links the individual elements together, the lt_key field contains the search key, and the lt_data field contains the data corresponding to that key. The section of code between the S_LOCK() and the S_UNLOCK() primitives is a critical section. Only one CPU at a time may be executing in this critical section.

A poor choice of locking primitive can result in excessive overhead and poor performance under heavy load. The pattern language in this paper will help you determine what kind of locking primitive to use. This paper considers a few straightforward test-and-set, queued, and reader/writer locks, which will handle most situations.

This paper presents the implementation-level counterpart to a locking design pattern language [McK96]. Section 2 therefore gives an overview of locking design patterns. Section 3 describes the forces common to all of the patterns. Section 4 gives an overview of the contexts in which these patterns are useful. Section 5 presents several indexes to the patterns. Section 6 presents the patterns themselves.

Although design and implementation are often treated as separate activities, they are almost always deeply intertwined. Therefore, this section presents a brief overview of design-level patterns and the forces that act on them.

2.1 Overview of Locking Design Patterns

This paper refers to the following locking design patterns:

  • Sequential Program: A design with no parallelism, offering none of the benefits or problems associated with parallel programs.
  • Code Locking: A design where locks are associated with specific sections of …
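Figure 1 itself is not reproduced in this excerpt. As a minimal sketch of the kind of critical section the overview describes, the fragment below searches a linear list and updates the matching element while holding a lock. Only the field names lt_next, lt_key, and lt_data and the primitive names come from the text. Everything else is an assumption made for illustration: the struct name looktab, the variables looktab_head and looktab_mutex, the helper looktab_insert(), the int field types, and the mapping of the primitives onto a POSIX mutex.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the paper's locking primitives, mapped onto
 * a POSIX mutex purely for illustration. */
#define S_LOCK(l)   pthread_mutex_lock(l)
#define S_UNLOCK(l) pthread_mutex_unlock(l)

/* List element. Field names follow the text; the types are assumed. */
struct looktab {
    struct looktab *lt_next;  /* links the elements together */
    int             lt_key;   /* search key */
    int             lt_data;  /* data corresponding to the key */
};

/* Assumed names: one lock guards the whole list. */
static pthread_mutex_t looktab_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct looktab *looktab_head = NULL;

/* Push an element onto the front of the list (illustrative helper). */
void looktab_insert(struct looktab *e)
{
    S_LOCK(&looktab_mutex);
    e->lt_next = looktab_head;
    looktab_head = e;
    S_UNLOCK(&looktab_mutex);
}

/* Search for key and, if found, overwrite its data.
 * Returns 1 if an element was updated, 0 if the key is absent. */
int looktab_update(int key, int data)
{
    struct looktab *p;
    int found = 0;

    S_LOCK(&looktab_mutex);            /* critical section begins */
    for (p = looktab_head; p != NULL; p = p->lt_next) {
        if (p->lt_key == key) {
            p->lt_data = data;
            found = 1;
            break;
        }
    }
    S_UNLOCK(&looktab_mutex);          /* critical section ends */
    return found;
}
```

Because the whole traversal runs under one lock, every search and update is serialized. That serialization is the cost the choice of locking primitive has to keep small under heavy load.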

Similar resources

Adaptive Operating System Abstractions: a Case Study of Multiprocessor Locks

Operating system kernels typically offer a fixed and limited set of primitives and underlying mechanisms for use by application programs. However, the attainment of high performance for a variety of parallel applications may require the availability of additional primitives or of variants of existing primitives best suited for specific applications. Furthermore, operating system mechanisms must als...

Performance of Locking Primitives at Low Levels of Contention

There has been much work done modeling, simulating, and measuring the performance of locking primitives under high levels of contention. However, an important key to producing high-performance parallel programs is to maintain extremely low levels of contention. Despite its importance, the low-contention regime has been largely neglected. In order to fill this gap, this paper analyzes the perfor...

Efficient Synchronization: Let Them Eat QOLB

Efficient synchronization primitives are essential for achieving high performance in fine-grain, shared-memory parallel programs. One function of synchronization primitives is to enable exclusive access to shared data and critical sections of code. This paper makes three contributions. (1) We enumerate the five sources of overhead that locking synchronization primitives can incur. (2) We descri...

Parleda: a Library for Parallel Processing in Computational Geometry Applications

ParLeda is a software library that provides the basic primitives needed for parallel implementation of computational geometry applications. It can also be used in implementing a parallel application that uses geometric data structures. The parallel model that we use is based on a new heterogeneous parallel model named HBSP, which is based on BSP and is introduced here. ParLeda uses two main lib...

Selecting Locking Designs for Parallel Programs

Parallelizing a program can greatly speed it up. However, the synchronization primitives needed to protect shared resources result in contention, overhead, and added complexity. These costs can overwhelm the performance benefits of parallel execution. Since the only reason to go to the trouble of parallelizing a program is to gain performance, it is necessary to understand the performance impli...

Publication date: 1996